Identifying RNA-binding residues based on evolutionary conserved structural and energetic features

Authors

  • Yao Chi Chen
  • Karen Sargsyan
  • Jon D. Wright
  • Yi-Shuian Huang
  • Carmay Lim
Abstract

Increasing numbers of protein structures are solved each year, but many of these structures belong to proteins whose sequences are homologous to sequences in the Protein Data Bank. Nevertheless, the structures of homologous proteins belonging to the same family contain useful information because functionally important residues are expected to preserve physico-chemical, structural and energetic features. This information forms the basis of our method, which detects RNA-binding residues of a given RNA-binding protein as those residues that preserve physico-chemical, structural and energetic features in its homologs. Tests on 81 RNA-bound and 35 RNA-free protein structures showed that our method yields a higher fraction of true RNA-binding residues (higher precision) than two structure-based and two sequence-based machine-learning methods. Because the method requires no training data set and has no parameters, its precision does not degrade when applied to 'novel' protein sequences unlike methods that are parameterized for a given training data set. It was used to predict the 'unknown' RNA-binding residues in the C-terminal RNA-binding domain of human CPEB3. The two predicted residues, F430 and F474, were experimentally verified to bind RNA, in particular F430, whose mutation to alanine or asparagine nearly abolished RNA binding. The method has been implemented in a webserver called DR_bind1, which is freely available with no login requirement at http://drbind.limlab.ibms.sinica.edu.tw.
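The abstract reports performance as precision, described there as the fraction of predicted residues that are true RNA-binding residues. A minimal sketch of that metric follows, assuming predictions and known binding sites are given as sets of residue numbers. The function name `precision` and the example residue numbers are hypothetical and are not taken from the paper or from the DR_bind1 server.

```python
def precision(predicted, true_binding):
    """Fraction of predicted residues that are true RNA-binding residues."""
    predicted = set(predicted)
    if not predicted:
        # No predictions were made; report 0.0 rather than dividing by zero.
        return 0.0
    true_positives = predicted & set(true_binding)
    return len(true_positives) / len(predicted)


# Hypothetical residue numbers, for illustration only (not data from the paper).
predicted_residues = {10, 22, 57, 80}
known_binding_residues = {10, 22, 31, 57}

# Three of the four predicted residues are in the known set.
print(precision(predicted_residues, known_binding_residues))
```

Precision only penalizes false positives, so a method that predicts few residues can score high on it while still missing many true binding residues.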

Similar resources

Probing binding hot spots at protein-RNA recognition sites.

We use evolutionary conservation derived from structure alignment of polypeptide sequences along with structural and physicochemical attributes of protein-RNA interfaces to probe the binding hot spots at protein-RNA recognition sites. We find that the degree of conservation varies across the RNA binding proteins; some evolve rapidly compared to others. Additionally, irrespective of the structur...

Structural stabilization of GTP-binding domains in circularly permuted GTPases: Implications for RNA binding

GTP hydrolysis by GTPases requires crucial residues embedded in a conserved G-domain as sequence motifs G1-G5. However, in some of the recently identified GTPases, the motif order is circularly permuted. All possible circular permutations were identified after artificially permuting the classical GTPases and subjecting them to profile Hidden Markov Model searches. This revealed G4-G5-G1-G2-G3 a...

Network analysis of protein structures identifies functional residues.

Identifying active site residues strictly from protein three-dimensional structure is a difficult task, especially for proteins that have few or no homologues. We transformed protein structures into residue interaction graphs (RIGs), where amino acid residues are graph nodes and their interactions with each other are the graph edges. We found that active site, ligand-binding and evolutionary co...

Protein–DNA interactions: structural, thermodynamic and clustering patterns of conserved residues in DNA-binding proteins

Amino acid residues, which play important roles in protein function, are often conserved. Here, we analyze thermodynamic and structural data of protein-DNA interactions to explore a relationship between free energy, sequence conservation and structural cooperativity. We observe that the most stabilizing residues or putative hotspots are those which occur as clusters of conserved residues. The h...

Quantifying sequence and structural features of protein–RNA interactions

Increasing awareness of the importance of protein-RNA interactions has motivated many approaches to predict residue-level RNA binding sites in proteins based on sequence or structural characteristics. Sequence-based predictors are usually high in sensitivity but low in specificity; conversely structure-based predictors tend to have high specificity, but lower sensitivity. Here we quantified the...

Journal title:

Volume 42, Issue

Pages -

Publication date: 2014